arXiv:1505.07915v1 [math.ST] 29 May 2015


Estimation of the parameter of a dynamically selected 
population for two subclasses of the exponential family 


Morteza Amini† and Nader Nematollahi‡

† Department of Statistics, School of Mathematics, Statistics and Computer Science,
College of Science, University of Tehran, P.O. Box 14155-6455, Tehran, Iran
‡ Department of Statistics, Allameh Tabataba'i University, Tehran, Iran

June 1, 2015 


Abstract 

We introduce the problem of estimation of the parameters of a dynamically selected 
population in an infinite sequence of random variables and provide its application in the 
statistical inference based on record values from a non-stationary scheme. We develop 
unbiased estimation of the parameters of the dynamically selected population and 
evaluate the risk of the estimators. We provide comparisons with natural estimators 
and obtain asymptotic results. Finally, we illustrate the applicability of the results 
using real data. 
1 Introduction

The problem of estimating parameters of selected populations has wide practical
applications in the analysis of experimental data in agriculture, industry and medicine.
Real-world applications of this theory include estimating the average yield of the
plant variety selected for its maximum yield (Kumar and Kar, 2001), estimating the
average fuel efficiency of the vehicle with minimum fuel consumption (Kumar and
Gangopadhyay, 2005), and selecting the regimen with maximal efficacy or minimal toxicity
from a set of regimens and estimating a treatment effect for the selected regimen (Sill and
Sampson, 2007).

The problem of estimation after selection has received considerable attention from many
researchers in the past three decades. Interested readers are referred to, for example,
Gibbons et al. (1977) for more details. Other contributions in this area include Sarkadi
(1967), Dahiya (1974), Kumar and Kar (2001), Misra et al. (2006a,b), Kumar et al. (2009)
and Nematollahi and Motammed-Shariati (2012). For a summary of results, as well as a
list of references until 2006, see Misra et al. (2006a,b).

In this paper, we introduce and develop the problem of estimating the parameters
of a dynamically selected population from an infinite sequence of populations, which, to
the best of our knowledge, has not been studied in the literature. Let $X_1, X_2, \ldots$ be a
sequence of random variables where $X_i$ is drawn from population $\Pi_i$ with corresponding
cumulative distribution function (cdf) $F_{\theta_i}(\cdot)$ and probability density function (pdf)
$f_{\theta_i}(\cdot)$.

The traffic volume trend, daily temperatures, sequences of stock quotes, or sequences of 
estimators of interior water volume in a dam reservoir are examples of such sequences. 

Suppose we want to estimate the parameter of the population corresponding to the
largest value of the sequence $X_1, X_2, \ldots$ yet seen, that is

$$\theta^{U}_{[n]} = \theta_{T_n},$$

where $T_1 = 1$, with probability one, and for $n > 1$

$$T_n = \min\{j;\ j > T_{n-1},\ X_j > X_{T_{n-1}}\},$$

or similarly the parameter of the population corresponding to the smallest value of the
sequence $X_1, X_2, \ldots$ yet seen, that is

$$\theta^{L}_{[n]} = \theta_{T'_n},$$

where $T'_1 = 1$, with probability one, and for $n > 1$

$$T'_n = \min\{j;\ j > T'_{n-1},\ X_j < X_{T'_{n-1}}\}.$$

We want to estimate the upper parameters $\theta^{U}_{[n]}$ and, similarly, the lower ones
$\theta^{L}_{[n]}$. This happens, for example, when we want to estimate the largest value of
traffic volume or stock quotes yet seen, the temperature of the coldest day or the largest
volume of water coming into the dam reservoir, up to now.

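For simplicity, we denote $\theta^{U}_{[n]}$ by $\theta_{[n]}$ hereafter. We may write

$$\theta_{[n]} = \sum_{j=n}^{\infty} \theta_j I_j(X_1, X_2, \ldots), \tag{1}$$

where

$$I_j = I_j(X_1, X_2, \ldots) =
\begin{cases}
1, & \max_{T_{n-1}+1 \le k \le j-1} X_k < X_{T_{n-1}} < X_j, \\
0, & \text{o.w.}
\end{cases}
= I\left(\max\{X_k;\ T_{n-1}+1 \le k \le j-1\} < X_{T_{n-1}} < X_j\right). \tag{2}$$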
The statistics $U_n = X_{T_n}$ and $L_n = X_{T'_n}$ are called upper and lower records, respectively.
In the sequence $X_1, X_2, \ldots$, the sequences of partial maxima and upper record statistics are
defined by $M_n = \max\{X_1, X_2, \ldots, X_n\}$ and $U_n = X_{T_n} = M_{T_n}$, respectively, where $T_1 = 1$
with probability 1, and $T_{n+1} = \min\{j;\ M_j > M_{T_n}\}$, for $n \ge 1$. The record statistics $U_n$
can be viewed as the dynamic maxima of the original random variables. We therefore call the
problem of estimating $\theta_{[n]}$ the estimation of the parameter of a dynamically selected
population.
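The definitions of the record times $T_n$ and the upper records $U_n = X_{T_n}$ can be illustrated with a short sketch. The function names below are hypothetical and introduced only for illustration; indices are 1-based to match the text, and a new record requires a strict exceedance of the running maximum.

```python
from typing import List, Sequence


def upper_record_times(xs: Sequence[float]) -> List[int]:
    """Return the 1-based record times T_1 < T_2 < ... of the upper records in xs."""
    times: List[int] = []
    current_max = float("-inf")
    for j, x in enumerate(xs, start=1):
        # A new upper record occurs when x strictly exceeds the running maximum.
        if x > current_max:
            times.append(j)
            current_max = x
    return times


def upper_records(xs: Sequence[float]) -> List[float]:
    """Return the upper record values U_n = X_{T_n}."""
    return [xs[t - 1] for t in upper_record_times(xs)]


if __name__ == "__main__":
    sample = [3.0, 1.5, 4.2, 4.2, 2.0, 5.1]
    print(upper_record_times(sample))  # [1, 3, 6]
    print(upper_records(sample))       # [3.0, 4.2, 5.1]
```

In this toy sequence the tie at the fourth position does not count as a record, so the parameter selected at $n = 3$ would be $\theta_6$.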

There is a vast literature on records for iid as well as non-stationary random variables. A 
thorough survey of available results, until 1998, is given in the book of Arnold et al. (1998). 
More recent articles on record values include, among others, Amini and Balakrishnan (2013, 
2015), Doostparast and Emadi (2013), Salehi et al. (2013), Ahmadi and Balakrishnan 
(2013, 2010), Psarrakos and Navarro (2013), Raqab and Ahmadi (2012), Zarezadeh and 
Asadi (2010), Kundu et al. (2009) and Baklizi (2008). 

This problem is related to the so-called general record model. The geometrically
increasing populations, the Pfeifer, the linear drift and the $F^{\alpha}$ record models are some of
the commonly used record models. The basics of non-stationary schemes for the record
values are due to Nevzorov (1985, 1986) and Pfeifer (1989, 1991), who considered the
so-called $F^{\alpha}$-scheme, that is, sequences of independent random variables with distribution
$F_k(x) = (F(x))^{\alpha_k}$, $k = 1, 2, \ldots$, where $F$ is a continuous cdf and the $\alpha_k$'s are positive
parameters. A further generalization of the $F^{\alpha}$-scheme was suggested by Ballerini and Resnick
(1987). Although non-stationary schemes could be employed in the most general setting,
the special case of improving populations is usually of special interest. Alternative
non-stationary schemes include geometrically increasing populations, linear trend and Pfeifer
models.

In all the above models, strict assumptions are made on the sequence of parameters
$\{\theta_i, i \ge 1\}$. For instance, in the $F^{\alpha}$ record model, the sequence of the parameters is
assumed to be known or to depend on a fixed unknown parameter. In the linear drift model, a
linearly increasing population is assumed as the underlying population. However, certain natural
phenomena may behave otherwise. For example, an earthquake is produced by a natural
phenomenon which has a pivotal parameter that varies based on an unknown model. In
order to predict extremely destructive earthquakes, a very important question is what
value of the parameter produces a new record in the sequence of earthquakes. This
motivates us to study the problem of dynamic after-selection estimation.

The rest of this paper is organized as follows. The theoretical results of the dynamic
after-selection problem, consisting of unbiased estimation of the parameters of the model as
well as unbiased estimation of the risk of the estimators, are presented in Sections 2 and 3. In
Section 4, we compare the proposed estimators with some natural estimators. Asymptotic
distributional results for studying the limiting behavior of the risks of the estimators are
given in Section 5. Finally, a real data example is considered in Section 6 to illustrate
the applicability of the results.
2 Minimum variance unbiased estimation

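Let $\boldsymbol\theta = (\theta_1, \theta_2, \ldots)$, $\mathbf{X} = (X_1, X_2, \ldots)$ and $h_{\mathbf X}(\boldsymbol\theta)$ be a random parameter (a function
of $\mathbf X$ and $\boldsymbol\theta$). Suppose that $h_{\mathbf X}(\boldsymbol\theta)$ is estimated by $\delta(\mathbf X)$. Following Lehmann (1951), the
estimator $\delta(\mathbf X)$ is said to be risk unbiased for $h_{\mathbf X}(\boldsymbol\theta)$ under the loss function $L(h_{\mathbf X}(\boldsymbol\theta), \delta(\mathbf X))$
if it satisfies

$$E_{\boldsymbol\theta}\left(L(h_{\mathbf X}(\boldsymbol\theta), \delta(\mathbf X))\right) \le E_{\boldsymbol\theta}\left(L(h_{\mathbf X}(\boldsymbol\theta'), \delta(\mathbf X))\right), \quad \forall\, \boldsymbol\theta' \ne \boldsymbol\theta. \tag{3}$$

Under the squared error loss (SEL) function

$$L(h_{\mathbf X}(\boldsymbol\theta), \delta(\mathbf X)) = (h_{\mathbf X}(\boldsymbol\theta) - \delta(\mathbf X))^2,$$

the condition (3) reduces to

$$E_{\boldsymbol\theta}(\delta(\mathbf X)) = E_{\boldsymbol\theta}(h_{\mathbf X}(\boldsymbol\theta)). \tag{4}$$

In this section, we use the U-V method of Robbins (1988) to find the Uniformly Minimum
Variance Unbiased (UMVU) estimator of $\theta_{[n]}$ under the two models, 1 and 2, presented
below.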

Model 1: Let $X_1, X_2, \ldots$ be a sequence of independent absolutely continuous random
variables with pdf

$$f(x_i; \theta_i) = c(x_i)\, \theta_i^{-p}\, e^{-S(x_i)/\theta_i}, \tag{5}$$

where $S(X_i)$ is a complete sufficient statistic with the Gamma$(p, \theta_i)$-distribution. Some
well-known members of the above family are:

1. Exponential$(\theta_i)$, with $p = 1$, $S(x_i) = x_i$ and $c(x_i) = 1$;

2. Gamma$(p, \theta_i)$, with $S(x_i) = x_i$ and $c(x_i) = x_i^{p-1}/\Gamma(p)$;

3. Normal$(0, \sigma_i^2)$, with $\theta_i = \sigma_i^2$, $p = 1/2$, $S(x_i) = x_i^2/2$ and $c(x_i) = (2\pi)^{-1/2}$;

4. Inverse Gaussian$(\infty, \lambda_i)$, with $\theta_i = 1/\lambda_i$, $p = 1/2$, $S(x_i) = 1/(2x_i)$ and
$c(x_i) = (2\pi x_i^3)^{-1/2}$;

5. Weibull$(\eta_i, \beta)$, with known $\beta$, $\theta_i = \eta_i^{\beta}$, $p = 1$, $S(x_i) = x_i^{\beta}$ and $c(x_i) = \beta x_i^{\beta-1}$;

6. Rayleigh$(\beta_i)$, with $\theta_i = \beta_i^2$, $p = 1$, $S(x_i) = x_i^2/2$ and $c(x_i) = x_i$.

To estimate $\theta_{[n]}$ in the family of distributions (5), we first consider the estimation of
$\theta_{[n]}$ under the Gamma$(p, \theta_i)$-distribution with pdf

$$f(x_i \mid \theta_i) = \frac{1}{\Gamma(p)\, \theta_i^{p}}\, x_i^{p-1} \exp\{-x_i/\theta_i\}, \quad i = 1, 2, \ldots. \tag{6}$$

By using the U-V method of Robbins (1988), we have the following lemma (see also
Vellaisamy and Sharma, 1989).
Lemma 1 Let $X_1, X_2, \ldots$ be a sequence of independent random variables with densities
defined in (6). Let $u_j(\mathbf x)$ be a real-valued function such that for $j = 1, 2, \ldots$,

(i) $E_{\boldsymbol\theta}[\,|u_j(\mathbf X)|\,] < \infty$, $\forall\, \boldsymbol\theta$;

(ii) $\int_0^{x_j} u_j(x_1, \ldots, x_{j-1}, t, x_{j+1}, \ldots)\, t^{p-1}\, dt < \infty$, $\forall\, x_j > 0$.

Then the functions

$$v_j(\mathbf X) = \frac{1}{X_j^{p-1}} \int_0^{X_j} u_j(X_1, \ldots, X_{j-1}, t, X_{j+1}, \ldots)\, t^{p-1}\, dt, \quad j = 1, 2, \ldots,$$

satisfy

$$E_{\boldsymbol\theta}[v_j(\mathbf X)] = E_{\boldsymbol\theta}[\theta_j u_j(\mathbf X)], \quad j = 1, 2, \ldots.$$
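As a sketch of why the identity in Lemma 1 holds, consider a single Gamma$(p, \theta)$ variable $X$ and write $v(x) = x^{-(p-1)} \int_0^x u(t)\, t^{p-1}\, dt$. Assuming the integrability conditions of the lemma permit the interchange of integrals, the one-component case reads:

```latex
\begin{align*}
E_\theta[v(X)]
 &= \int_0^\infty \left\{ \frac{1}{x^{p-1}} \int_0^x u(t)\, t^{p-1}\, dt \right\}
    \frac{x^{p-1} e^{-x/\theta}}{\Gamma(p)\,\theta^{p}}\, dx \\
 &= \frac{1}{\Gamma(p)\,\theta^{p}} \int_0^\infty u(t)\, t^{p-1}
    \left\{ \int_t^\infty e^{-x/\theta}\, dx \right\} dt \\
 &= \frac{\theta}{\Gamma(p)\,\theta^{p}} \int_0^\infty u(t)\, t^{p-1} e^{-t/\theta}\, dt
  = \theta\, E_\theta[u(X)].
\end{align*}
```

The sequence version follows by applying the same calculation to the $j$th coordinate while conditioning on the others.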

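The next result obtains the unbiased estimator of $\theta_{[n]}$ under the SEL function, for the
Gamma$(p, \theta_i)$ distribution with the pdf of $X_i$ as in (6).

Theorem 1 For the Gamma$(p, \theta_i)$ distribution with the pdf of $X_i$ as in (6), an unbiased
estimator of $\theta_{[n]}$, under the SEL function, which satisfies (4) with $h_{\mathbf X}(\boldsymbol\theta) = \theta_{[n]}$, is

$$V_1(\mathbf X) = \frac{U_n}{p}\left[1 - \left(\frac{U_{n-1}}{U_n}\right)^{p}\right], \tag{7}$$

where $U_n$ is the $n$th upper record value of the sequence $X_1, X_2, \ldots$.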

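Proof From (1), (2) and Lemma 1, an unbiased estimator of $\theta_{[n]}$, under the SEL function,
based on $X_1, X_2, \ldots$ is given by

$$V_1(\mathbf X) = \sum_{j=n}^{\infty} v_j(\mathbf X)
 = \sum_{j=n}^{\infty} \frac{1}{X_j^{p-1}} \int_0^{X_j} t^{p-1}\, I_j(X_1, \ldots, X_{j-1}, t, X_{j+1}, \ldots)\, dt,$$

where $I_j(X_1, X_2, \ldots)$ is defined in (2). Thus,

$$V_1(\mathbf X) = \sum_{j=n}^{\infty} \frac{1}{X_j^{p-1}} \int_{U_{n-1}}^{X_j} t^{p-1}\, dt\;
   I\left(\max\{X_k;\ T_{n-1}+1 \le k \le j-1\} < U_{n-1} < X_j\right)$$

$$= \frac{1}{U_n^{p-1}} \int_{U_{n-1}}^{U_n} t^{p-1}\, dt
 = \frac{U_n^{p} - U_{n-1}^{p}}{p\, U_n^{p-1}}
 = \frac{U_n}{p}\left[1 - \left(\frac{U_{n-1}}{U_n}\right)^{p}\right].$$

□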
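A minimal sketch of the estimator in Theorem 1, reading (7) as $V_1 = (U_n/p)\,[1 - (U_{n-1}/U_n)^p]$ as obtained at the end of the proof. The function name is hypothetical; the inputs are the two most recent upper records and the known shape $p$.

```python
def umvu_gamma(u_n: float, u_prev: float, p: float) -> float:
    """Evaluate (U_n / p) * (1 - (U_{n-1} / U_n) ** p) for 0 < U_{n-1} < U_n and p > 0."""
    if not (0.0 < u_prev < u_n):
        raise ValueError("records must satisfy 0 < U_{n-1} < U_n")
    if p <= 0.0:
        raise ValueError("the shape parameter p must be positive")
    return (u_n / p) * (1.0 - (u_prev / u_n) ** p)


if __name__ == "__main__":
    # For p = 1 (exponential case) the estimator is simply U_n - U_{n-1}.
    print(umvu_gamma(4.0, 1.0, 1.0))  # 3.0
    print(umvu_gamma(4.0, 1.0, 0.5))  # 4.0
    print(umvu_gamma(4.0, 1.0, 2.0))  # 1.875
```

Since $0 < U_{n-1}/U_n < 1$, the value always lies strictly between $0$ and $U_n/p$.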


To find an unbiased estimator of $\theta_{[n]}$ under Model 1 with the pdf of $X_i$ as in (5),
let $Y_i = S(X_i) \sim$ Gamma$(p, \theta_i)$, $i = 1, 2, \ldots$, $\mathbf Y = (Y_1, Y_2, \ldots)$ and $\mathbf y = (y_1, y_2, \ldots)$. Then,
by replacing $X_i$ with $Y_i = S(X_i)$ in Theorem 1, an unbiased estimator of $\theta_{[n]}$, under the
SEL function, for the general family (5), can be obtained as

$$V_2(\mathbf X) = \frac{U_n^{S}}{p}\left[1 - \left(\frac{U_{n-1}^{S}}{U_n^{S}}\right)^{p}\right], \tag{8}$$

where $U_n^{S}$ is the $n$th upper record value of the sequence $Y_1, Y_2, \ldots$.

For a monotone function $S(\cdot)$ (available in all of the above examples, except in the
normal distribution), $U_n^{S}$ can be obtained simply as $S(U_n)$ for an increasing $S$ and as
$S(L_n)$ for a decreasing $S$. For example, for the Rayleigh$(\beta_i)$-distribution, an unbiased
estimator for $\beta^2_{[n]}$ is

$$\widehat{\beta^2}_{[n]} = \frac{U_n^2}{2}\left(1 - \left(\frac{U_{n-1}}{U_n}\right)^{2}\right) = \frac{U_n^2 - U_{n-1}^2}{2}.$$


Model 2: For $X_i$, $i = 1, 2, \ldots$, consider two families of distributions, the first with $X_i$
having the survival function

$$\bar F_{\theta_i}(x) = 1 - F_{\theta_i}(x) = (\bar G(x))^{1/\theta_i}, \tag{9}$$

and the second with $X_i$ having the cdf

$$F_{\theta_i}(x) = (G(x))^{1/\theta_i}, \tag{10}$$

in which $G(x)$ is a cdf, free of $\theta_i$, and $\bar G(x) = 1 - G(x)$. We assume $G$ to be known.
These are called proportional hazard rate and proportional reversed hazard rate families, or
simply $F^{\alpha}$ models in the context of record values. Some well-known members of the above
families are:

1. Exponential$(\theta_i)$, a member of (9) with $\bar G(x) = e^{-x}$, $x > 0$;

2. Rayleigh$(\theta_i)$, a member of (9) with $\bar G(x) = e^{-x^2/2}$, $x > 0$;

3. Beta$(\theta_i^{-1}, 1)$, a member of (10) with $G(x) = x$, $0 < x < 1$;

4. Pareto$(\theta_i^{-1}, \beta)$, a member of (9) with $\bar G(x) = \beta/x$, $x > \beta$,

and

5. Burr$(\alpha, \theta_i^{-1})$, a member of (9) with $\bar G(x) = (1 + x^{\alpha})^{-1}$, $x > 0$.

By making use of the U-V method of Robbins (1988) for the family (9), we have the
following lemma.

Lemma 2 Let $X_1, X_2, \ldots$ be a sequence of independent random variables with survival
function defined in (9). Let $u_j(\mathbf x)$ be a real-valued function such that for $j = 1, 2, \ldots$,

(i) $E_{\boldsymbol\theta}[\,|u_j(\mathbf X)|\,] < \infty$, $\forall\, \boldsymbol\theta$;

(ii) $\int_{-\infty}^{x_j} u_j(x_1, \ldots, x_{j-1}, t, x_{j+1}, \ldots)\, h(t)\, dt < \infty$, $\forall\, x_j > 0$,

in which $h = g/\bar G$ is the hazard function of $G$ and $g$ is the corresponding pdf of $G$. Then
the functions

$$v_j(\mathbf X) = \int_{-\infty}^{X_j} u_j(X_1, \ldots, X_{j-1}, t, X_{j+1}, \ldots)\, h(t)\, dt, \quad j = 1, 2, \ldots,$$

satisfy

$$E_{\boldsymbol\theta}[v_j(\mathbf X)] = E_{\boldsymbol\theta}[\theta_j u_j(\mathbf X)], \quad j = 1, 2, \ldots.$$


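Proof For the one-component problem (i.e., a single random variable $X_j$, $j \ge 1$), let
$v(x) = \int_{-\infty}^{x} u(t) h(t)\, dt$. Then, we have

$$\theta_j E(u(X_j)) = \int_{-\infty}^{+\infty} u(x)\, [\bar G(x)]^{\frac{1}{\theta_j} - 1} g(x)\, dx$$

$$= \int_{-\infty}^{+\infty} u(x)\, \bar F_{\theta_j}(x)\, h(x)\, dx$$

$$= \int_{-\infty}^{+\infty} u(x)\, h(x) \left\{ \int_x^{+\infty} dF_{\theta_j}(y) \right\} dx$$

$$= \int_{-\infty}^{+\infty} \int_{-\infty}^{y} u(x)\, h(x)\, dx\, dF_{\theta_j}(y) = \int_{-\infty}^{+\infty} v(y)\, dF_{\theta_j}(y).$$

For the sequence $X_1, X_2, \ldots$, the result follows by a similar calculation.

□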


The next result gives the unbiased estimator of $\theta_{[n]}$, under the SEL function, for the general
family (9).

Theorem 2 Assume $G$ to be known and let $H = -\log \bar G$ be the cumulative hazard function
of $G$. For the general family (9), an unbiased estimator of $\theta_{[n]}$, under the SEL function, is

$$V_3(\mathbf X) = H(U_n) - H(U_{n-1}). \tag{11}$$


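Proof From (1), (2) and Lemma 2, an unbiased estimator of $\theta_{[n]}$ is given by

$$V_3(\mathbf X) = \sum_{j=n}^{\infty} v_j(\mathbf X) = \sum_{j=n}^{\infty} \int_{-\infty}^{X_j} h(t)\, I_j(X_1, X_2, \ldots, X_{j-1}, t, X_{j+1}, \ldots)\, dt$$

$$= \sum_{j=n}^{\infty} \left\{ \int_{U_{n-1}}^{X_j} h(t)\, dt \right\} \times I\left(\max\{X_k;\ T_{n-1}+1 \le k \le j-1\} < U_{n-1} < X_j\right)$$

$$= H(U_n) - H(U_{n-1}).$$

□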
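The estimator $H(U_n) - H(U_{n-1})$ of Theorem 2 can be sketched for several members of family (9), taking $H = -\log \bar G$ from the base survival functions listed under Model 2. All function names below are hypothetical helpers introduced for illustration only.

```python
import math
from typing import Callable


def umvu_phr(u_n: float, u_prev: float, cum_hazard: Callable[[float], float]) -> float:
    """Evaluate H(U_n) - H(U_{n-1}) for a known cumulative hazard H of the base cdf G."""
    if not u_prev < u_n:
        raise ValueError("records must satisfy U_{n-1} < U_n")
    return cum_hazard(u_n) - cum_hazard(u_prev)


def h_exponential(x: float) -> float:
    """H(x) = x, from G-bar(x) = exp(-x), x > 0."""
    return x


def h_rayleigh(x: float) -> float:
    """H(x) = x^2 / 2, from G-bar(x) = exp(-x^2 / 2), x > 0."""
    return 0.5 * x * x


def make_h_pareto(beta: float) -> Callable[[float], float]:
    """H(x) = log(x / beta), from G-bar(x) = beta / x, x > beta."""
    return lambda x: math.log(x / beta)


def make_h_burr(alpha: float) -> Callable[[float], float]:
    """H(x) = log(1 + x^alpha), from G-bar(x) = 1 / (1 + x^alpha), x > 0."""
    return lambda x: math.log1p(x ** alpha)


if __name__ == "__main__":
    print(umvu_phr(3.0, 1.0, h_exponential))  # 2.0
    print(umvu_phr(3.0, 1.0, h_rayleigh))     # 4.0
```

Because $H$ is nondecreasing and $U_{n-1} < U_n$, the estimate is never negative.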
Remark 1 Similarly, for the family (10), an unbiased estimator for $\theta_{[n]}$, under the SEL
function, is

$$V_4(\mathbf X) = R(U_n) - R(U_{n-1}),$$

where $R = \log G$ is the cumulative reversed hazard function of the known cdf $G$.

Remark 2 Note that $(X_1, X_2, \ldots)$ is a complete sufficient statistic for $(\theta_1, \theta_2, \ldots)$. Hence,
the above unbiased estimators of $\theta_{[n]}$ are indeed UMVU estimators of $\theta_{[n]}$.

3 Estimation of the Risks 

To compare the UMVU estimator with other estimators, we need to compute the risk
function of the proposed estimators.

Under the SEL function, the risk of an estimator $V$ is

$$R(V, \theta_{[n]}) = E(V^2) + E(\theta_{[n]}^2) - 2E(V\theta_{[n]}).$$

The UMVU estimators obtained in Section 2 are functions of $(U_n, U_{n-1})$. Suppose we
want to estimate the risk of an estimator of $\theta_{[n]}$ which depends on $\mathbf X$ only through $U_n$ and
$U_{n-1}$, i.e. $V = V(U_n, U_{n-1})$. Then, we have the following results, under Models 1 and 2,
respectively.

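Theorem 3 Under Model 1 and the SEL function, an unbiased estimator of the risk
of an estimator $V = V(U_n^{S}, U_{n-1}^{S})$ of $\theta_{[n]}$ is

$$W(U_n^{S}, U_{n-1}^{S}) = V^2(U_n^{S}, U_{n-1}^{S})
 - 2\, \frac{\int_{U_{n-1}^{S}}^{U_n^{S}} t^{p-1}\, V(t, U_{n-1}^{S})\, dt}{(U_n^{S})^{p-1}}
 + \frac{(U_n^{S})^{p+1} - (U_{n-1}^{S})^{p+1} - (p+1)(U_{n-1}^{S})^{p}\,(U_n^{S} - U_{n-1}^{S})}{p(p+1)\,(U_n^{S})^{p-1}}.$$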


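Proof From Lemma 1 with $Y_i = S(X_i)$, we have

$$E(\theta_{[n]}^2) = \sum_{j=n}^{\infty} \theta_j^2\, E(I_j(\mathbf Y)) = \sum_{j=n}^{\infty} \theta_j\, E[v_j(\mathbf Y)] = \sum_{j=n}^{\infty} E[w_j(\mathbf Y)],$$

where

$$v_j(\mathbf Y) = \frac{1}{Y_j^{p-1}} \int_0^{Y_j} t^{p-1}\, I_j(Y_1, \ldots, Y_{j-1}, t, Y_{j+1}, \ldots)\, dt$$

and

$$w_j(\mathbf Y) = \frac{1}{Y_j^{p-1}} \int_0^{Y_j} s^{p-1}\, v_j(Y_1, \ldots, Y_{j-1}, s, Y_{j+1}, \ldots)\, ds
 = \frac{1}{Y_j^{p-1}} \int_0^{Y_j} s^{p-1} \left\{ \frac{1}{s^{p-1}} \int_0^{s} t^{p-1}\, I_j(Y_1, \ldots, Y_{j-1}, t, Y_{j+1}, \ldots)\, dt \right\} ds.$$

Therefore

$$E(\theta_{[n]}^2) = E\left( \sum_{j=n}^{\infty} \frac{1}{Y_j^{p-1}} \int_{U_{n-1}^{S}}^{Y_j} \int_{U_{n-1}^{S}}^{s} t^{p-1}\, dt\, ds\;
   I\left(\max\{Y_k;\ T_{n-1}+1 \le k \le j-1\} < U_{n-1}^{S} < Y_j\right) \right)$$

$$= E\left( \frac{1}{(U_n^{S})^{p-1}} \int_{U_{n-1}^{S}}^{U_n^{S}} \int_{U_{n-1}^{S}}^{s} t^{p-1}\, dt\, ds \right)$$

$$= E\left( \frac{(U_n^{S})^{p+1} - (U_{n-1}^{S})^{p+1} - (p+1)(U_{n-1}^{S})^{p}\,(U_n^{S} - U_{n-1}^{S})}{p(p+1)\,(U_n^{S})^{p-1}} \right).$$

Furthermore

$$E\left(\theta_{[n]}\, V(U_n^{S}, U_{n-1}^{S})\right) = \sum_{j=n}^{\infty} \theta_j\, E\left(I_j(\mathbf Y)\, V(Y_j, U_{n-1}^{S})\right)$$

$$= \sum_{j=n}^{\infty} E\left( \frac{1}{Y_j^{p-1}} \int_0^{Y_j} t^{p-1}\, V(t, U_{n-1}^{S})\, I_j(Y_1, \ldots, Y_{j-1}, t, Y_{j+1}, \ldots)\, dt \right)$$

$$= E\left( \frac{1}{(U_n^{S})^{p-1}} \int_{U_{n-1}^{S}}^{U_n^{S}} t^{p-1}\, V(t, U_{n-1}^{S})\, dt \right),$$

which completes the proof.

□

An immediate corollary of Theorem 3 is as follows.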


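Corollary 1 Under Model 1 and the SEL function, an unbiased estimator of the risk
of $V_2(\mathbf X)$ in (8), obtained by substituting $V = V_2$ in Theorem 3, is

$$W_2(U_n^{S}, U_{n-1}^{S}) = V_2^2(\mathbf X)
 - \frac{(U_n^{S})^{p+1} - (U_{n-1}^{S})^{p+1} - (p+1)(U_{n-1}^{S})^{p}\,(U_n^{S} - U_{n-1}^{S})}{p(p+1)\,(U_n^{S})^{p-1}}.$$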
Theorem 4 For the general family (9), and under the SEL function, an unbiased
estimator of the risk of an estimator $V = V(U_n, U_{n-1})$ of $\theta_{[n]}$ is

$$W(U_n, U_{n-1}) = V^2(U_n, U_{n-1}) + \frac{1}{2}\left(H(U_n) - H(U_{n-1})\right)^2
 - 2 \int_{U_{n-1}}^{U_n} h(t)\, V(t, U_{n-1})\, dt.$$

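Proof From Lemma 2 and using a similar argument as in the proof of Theorem 3, we have

$$E(\theta_{[n]}^2) = \sum_{j=n}^{\infty} \theta_j^2\, E(I_j(\mathbf X))$$

$$= \sum_{j=n}^{\infty} \theta_j\, E\left( \int_{-\infty}^{X_j} h(t)\, I_j(X_1, \ldots, X_{j-1}, t, X_{j+1}, \ldots)\, dt \right)$$

$$= \sum_{j=n}^{\infty} E\left( \int_{-\infty}^{X_j} h(s) \int_{-\infty}^{s} h(t)\, I_j(X_1, \ldots, X_{j-1}, t, X_{j+1}, \ldots)\, dt\, ds \right)$$

$$= E\left( \int_{U_{n-1}}^{U_n} h(s) \int_{U_{n-1}}^{s} h(t)\, dt\, ds \right)$$

$$= E\left( \frac{H^2(U_n) - H^2(U_{n-1})}{2} - H(U_{n-1})\left(H(U_n) - H(U_{n-1})\right) \right)$$

$$= E\left( \frac{1}{2}\left(H(U_n) - H(U_{n-1})\right)^2 \right).$$

Furthermore

$$E\left(\theta_{[n]}\, V(U_n, U_{n-1})\right) = \sum_{j=n}^{\infty} \theta_j\, E\left(I_j(\mathbf X)\, V(X_j, U_{n-1})\right)$$

$$= \sum_{j=n}^{\infty} E\left( \int_{-\infty}^{X_j} h(t)\, V(t, U_{n-1})\, I_j(X_1, \ldots, X_{j-1}, t, X_{j+1}, \ldots)\, dt \right)$$

$$= E\left( \int_{U_{n-1}}^{U_n} h(t)\, V(t, U_{n-1})\, dt \right).$$

This completes the proof. □

An immediate corollary of Theorem 4 is as follows.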
Corollary 2 For the general family (9) and under the SEL function,

(i) an unbiased estimator of the risk of

$$V_3 = H(U_n) - H(U_{n-1})$$

is

$$W_3(U_n, U_{n-1}) = \frac{1}{2}\left(H(U_n) - H(U_{n-1})\right)^2;$$

(ii) the risk of $V_3$ is

$$R\left(H(U_n) - H(U_{n-1}),\ \theta_{[n]}\right) = E(\theta_{[n]}^2).$$
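As a worked check of Corollary 2, take Theorem 4 in the form $W = V^2 + \tfrac12 (H(U_n) - H(U_{n-1}))^2 - 2\int_{U_{n-1}}^{U_n} h(t) V(t, U_{n-1})\, dt$, where the factor $\tfrac12$ is read off the double integral in its proof, and substitute $V(t, U_{n-1}) = H(t) - H(U_{n-1})$. Writing $\Delta = H(U_n) - H(U_{n-1})$:

```latex
\begin{align*}
\int_{U_{n-1}}^{U_n} h(t)\,\{H(t) - H(U_{n-1})\}\, dt
  &= \tfrac12 \{H(U_n) - H(U_{n-1})\}^2 = \tfrac12 \Delta^2, \\
W_3(U_n, U_{n-1})
  &= \Delta^2 + \tfrac12 \Delta^2 - 2 \cdot \tfrac12 \Delta^2 = \tfrac12 \Delta^2, \\
R(V_3, \theta_{[n]}) &= E(W_3) = E\left(\tfrac12 \Delta^2\right) = E(\theta_{[n]}^2).
\end{align*}
```

The last equality uses the expression for $E(\theta_{[n]}^2)$ derived in the proof of Theorem 4.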

Remark 3 The results for the general family (10) can be obtained by replacing $H(\cdot)$ with
$R(\cdot) = \log G(\cdot)$ in Theorem 4 and Corollary 2.

Remark 4 Since $(X_1, X_2, \ldots)$ is a complete sufficient statistic for $(\theta_1, \theta_2, \ldots)$, the above
unbiased estimators of $R(V, \theta_{[n]})$ are indeed UMVU estimators of $R(V, \theta_{[n]})$.

The following result presents the distribution of the unbiased estimator in the family (9).

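Lemma 3 In the general family (9), the following identities hold:

(i) For every $n > 1$ and $y > 0$,

$$\Pr\left(H(U_n) - H(U_{n-1}) > y\right) = \sum_{j=1}^{\infty} e^{-y/\theta_j}\, \Pr(T_n = j);$$

(ii) For every $k \ge 2$, $n_1 > n_2 > \cdots > n_k > 1$ and $y_1, \ldots, y_k > 0$,

$$\Pr\left(\bigcap_{i=1}^{k} \{H(U_{n_i}) - H(U_{n_i - 1}) > y_i\}\right)
 = \sum_{j_1 < \cdots < j_k} \prod_{i=1}^{k} e^{-y_i/\theta_{j_i}}\, \Pr\left(\bigcap_{i=1}^{k} \{T_{n_i} = j_i\}\right).$$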

Proof Let $U_n^* = H(U_n)$ and $X_n^* = H(X_n)$, $n \ge 1$. We only prove part (i). Part (ii) is
proved in a similar way. Using the fact that $X_i^* \sim$ Exponential$(\theta_i)$ and the lack of memory
property of the exponential distribution,

$$\Pr(U_n^* - U_{n-1}^* > y) = \Pr(X^*_{T_n} - X^*_{T_{n-1}} > y)$$

$$= \sum_{i<j} \Pr(X_j^* - X_i^* > y \mid T_n = j, T_{n-1} = i)\, \Pr(T_n = j, T_{n-1} = i)$$

$$= \sum_{i<j} \Pr(X_j^* - X_i^* > y \mid X_j^* > X_i^*)\, \Pr(T_n = j, T_{n-1} = i)$$

$$= \sum_{i<j} \int \Pr(X_j^* - x > y \mid X_j^* > x)\, \Pr(T_n = j, T_{n-1} = i)\, f_{X_i^*}(x)\, dx$$

$$= \sum_{i<j} \int \Pr(X_j^* > y)\, \Pr(T_n = j, T_{n-1} = i)\, f_{X_i^*}(x)\, dx$$

$$= \sum_{j=1}^{\infty} e^{-y/\theta_j}\, \Pr(T_n = j),$$

which is the required result.
4 Inadmissibility of the natural estimator of $\theta_{[n]}$

For the general family with pdf (5), we have

$$E\left(S(X_i)/p\right) = \theta_i.$$

Thus, a natural estimator for $\theta_{[n]}$, for this family of distributions, is $U_n^{S}/p$. For the general
family with the survival function (9), we have

$$E\left(H(X_i)\right) = \theta_i,$$

which suggests $H(U_n)$ as a natural estimator of $\theta_{[n]}$. We therefore compare the risks of the
natural estimators with those of the UMVUEs of $\theta_{[n]}$, for both families of distributions.

The following corollary of Theorem 4 states that, under Model 2, the UMVUE
dominates the natural estimator.


Corollary 3 For the general family (9) and under the SEL function, we have

$$R\left(H(U_n), \theta_{[n]}\right) \ge R\left(H(U_n) - H(U_{n-1}), \theta_{[n]}\right).$$

Proof First, we have

$$E\left(H(U_{n-1})\, \theta_{[n]}\right) = \sum_{j=n}^{\infty} \theta_j\, E\left(H(U_{n-1})\, I_j(\mathbf X)\right)$$

$$= E\left( H(U_{n-1}) \sum_{j=n}^{\infty} \int_{-\infty}^{X_j} h(t)\, I_j(X_1, \ldots, X_{j-1}, t, X_{j+1}, \ldots)\, dt \right)$$

$$= E\left( H(U_{n-1}) \int_{U_{n-1}}^{U_n} h(t)\, dt \right)$$

$$= E\left( H(U_{n-1})\left(H(U_n) - H(U_{n-1})\right) \right).$$

Consequently,

$$R\left(H(U_n), \theta_{[n]}\right) - R\left(H(U_n) - H(U_{n-1}), \theta_{[n]}\right)$$

$$= 2E\left(H(U_n) H(U_{n-1})\right) - E\left(H^2(U_{n-1})\right) - 2E\left(H(U_{n-1})\, \theta_{[n]}\right)$$

$$= 2E\left(H(U_n) H(U_{n-1})\right) - E\left(H^2(U_{n-1})\right) - 2E\left(H(U_{n-1})\left(H(U_n) - H(U_{n-1})\right)\right)$$

$$= E\left(H^2(U_{n-1})\right) \ge 0.$$

This completes the proof.

□

However, under Model 1, no explicit result can be obtained on whether the
UMVUE or the natural estimator dominates the other, since we have similarly

$$R\left(V_2(\mathbf X), \theta_{[n]}\right) - R\left(U_n^{S}/p,\ \theta_{[n]}\right)
 = E\left( \frac{(U_{n-1}^{S})^{2p} - 2\,(U_{n-1}^{S})^{p}\,(U_n^{S})^{p} + 2p\,(U_n^{S})^{p-1}\,(U_{n-1}^{S})^{p}\,(U_n^{S} - U_{n-1}^{S})}{p^2\,(U_n^{S})^{2p-2}} \right).$$

To compare the UMVUE and the natural estimator under Model 1, we run a simulation
study, which is described in the following section.

4.1 Simulation study 

We assume $X_i \sim$ Gamma$(p, \theta_i)$, $i = 1, 2, \ldots$. To compare the risk of the UMVUE
$\hat\theta^{1}_{[n]} = \frac{U_n}{p}\left(1 - \left(\frac{U_{n-1}}{U_n}\right)^{p}\right)$ with that of the natural estimator $\hat\theta^{2}_{[n]} = U_n/p$, for $n = 2, 3, 4$ and
$p = 0.5, 2$, we consider three different models for the sequence of parameters as follows:

Model 1 (A stochastic, positive-error auto-regressive model):

$$\theta_i = Z_i \theta_{i-1} + \epsilon_i, \quad \epsilon_i \overset{iid}{\sim} \mathrm{Exp}(1), \quad Z_i \overset{iid}{\sim} U(0, 1), \quad i \ge 1, \quad \theta_0 = 0;$$

Model 2 (A stochastic geometrically increasing population):

$$\theta_i = C_i (1 + D_i/10)^{i-1}, \quad C_i, D_i \sim U(0, 1);$$

Model 3 (White noise model):

$$\theta_i = 10 + \epsilon_i, \quad \epsilon_i \sim N(0, 1).$$

The simulated biases and risks of the estimators are tabulated in Table 1. As one can observe
from Table 1, the simulated risks of $\hat\theta^{1}_{[n]}$ are less than those of $\hat\theta^{2}_{[n]}$. Also, biases and risks
are increasing in $n$, except the risks of $\hat\theta^{1}_{[n]}$ under the white noise Model 3.
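The simulation design above can be sketched with the Python standard library. This is an illustrative sketch rather than the code behind Table 1: the horizon length, the number of replications, the seed, the rule of dropping sequences that show fewer than $n$ records within the horizon, and all function names are assumptions, so the numbers it produces should not be expected to match the table exactly.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

ThetaModel = Callable[[int, random.Random], List[float]]


def theta_ar(m: int, rng: random.Random) -> List[float]:
    """Model 1: theta_i = Z_i * theta_{i-1} + eps_i, eps_i ~ Exp(1), Z_i ~ U(0,1), theta_0 = 0."""
    thetas, prev = [], 0.0
    for _ in range(m):
        prev = rng.random() * prev + rng.expovariate(1.0)
        thetas.append(prev)
    return thetas


def theta_geometric(m: int, rng: random.Random) -> List[float]:
    """Model 2: theta_i = C_i * (1 + D_i / 10) ** (i - 1); C_i is drawn from (0, 1]."""
    return [(1.0 - rng.random()) * (1.0 + rng.random() / 10.0) ** i for i in range(m)]


def theta_white_noise(m: int, rng: random.Random) -> List[float]:
    """Model 3: theta_i = 10 + eps_i, eps_i ~ N(0, 1)."""
    return [10.0 + rng.gauss(0.0, 1.0) for _ in range(m)]


def nth_record(xs: List[float], n: int) -> Optional[Tuple[int, float, float]]:
    """Return (0-based index of T_n, U_n, U_{n-1}) for n >= 2, or None if not reached."""
    records: List[float] = []
    for j, x in enumerate(xs):
        if not records or x > records[-1]:
            records.append(x)
            if len(records) == n:
                return j, x, records[-2]
    return None


def simulate(theta_model: ThetaModel, p: float, n: int,
             reps: int = 2000, horizon: int = 200, seed: int = 1) -> Dict[str, float]:
    """Monte Carlo risk of the UMVUE and bias/risk of the natural estimator U_n / p."""
    rng = random.Random(seed)
    se_umvu = se_nat = bias_nat = 0.0
    used = 0
    for _ in range(reps):
        thetas = theta_model(horizon, rng)
        # random.gammavariate(alpha, beta) uses beta as the scale, so the mean is p * theta.
        xs = [rng.gammavariate(p, th) for th in thetas]
        hit = nth_record(xs, n)
        if hit is None:
            continue  # assumption: sequences without n records in the horizon are dropped
        j, u_n, u_prev = hit
        target = thetas[j]  # theta_[n] = theta_{T_n}
        umvu = (u_n / p) * (1.0 - (u_prev / u_n) ** p)
        natural = u_n / p
        se_umvu += (umvu - target) ** 2
        se_nat += (natural - target) ** 2
        bias_nat += natural - target
        used += 1
    return {"risk_umvu": se_umvu / used, "risk_natural": se_nat / used,
            "bias_natural": bias_nat / used, "used": float(used)}


if __name__ == "__main__":
    for name, model in [("AR", theta_ar), ("geometric", theta_geometric),
                        ("white noise", theta_white_noise)]:
        print(name, simulate(model, p=2.0, n=2))
```

Truncating at a finite horizon is a practical compromise, since record times are heavy-tailed; a longer horizon reduces the share of dropped sequences at the cost of run time.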

5 Asymptotic results 

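From Corollary 2, the risk of the UMVUE of $\theta_{[n]}$ for the general family (9), $V_3 = H(U_n) -
H(U_{n-1})$, is

$$R(V_3, \theta_{[n]}) = \frac{1}{2} E\left(\left(H(U_n) - H(U_{n-1})\right)^2\right)
 = \frac{1}{2} E\left(\left(U_n^{H} - U_{n-1}^{H}\right)^2\right),$$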
Table 1: Simulated bias and risk of the UMVUE and the natural estimator of $\theta_{[n]}$ under
three different models from the gamma distribution for different values of n and p.

Model 1
  p     Estimator   Quantity   n = 2       n = 3       n = 4
  0.5   UMVUE       Risk       9.440638    14.75326    18.54895
        Natural     Bias       1.524951    4.747217    9.160673
        Natural     Risk       23.1851     84.08421    209.7748
  2     UMVUE       Risk       3.224838    6.856674    10.66222
        Natural     Bias       0.5978639   1.782032    3.29696
        Natural     Risk       3.886525    12.33907    27.96078

Model 2
  p     Estimator   Quantity   n = 2       n = 3       n = 4
  0.5   UMVUE       Risk       2.224561    53.26235    1785.95
        Natural     Bias       0.7864656   2.342428    6.334353
        Natural     Risk       5.501025    94.40079    2499.64
  2     UMVUE       Risk       0.5376576   2.314486    19.68881
        Natural     Bias       0.3038626   0.72345     1.335166
        Natural     Risk       0.6209572   2.658157    19.79643

Model 3
  p     Estimator   Quantity   n = 2       n = 3       n = 4
  0.5   UMVUE       Risk       161.3311    146.8202    125.2359
        Natural     Bias       13.682      30.34559    47.98977
        Natural     Risk       685.7074    1851.813    3543.839
  2     UMVUE       Risk       64.93679    74.52687    82.06017
        Natural     Bias       7.023687    13.47608    19.60645
        Natural     Risk       131.9781    297.2568    537.5641
where $U_n^{H}$ is the $n$th upper record value from the sequence $Y_1, Y_2, \ldots$ with $Y_j \sim \mathrm{Exp}(\theta_j)$.

Hence, the asymptotic joint distribution of $U_n^{H}$ and $U_{n-1}^{H}$ would be useful for computing
the risks of the estimators. The following theorem gives the required asymptotic
distribution.


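Theorem 5 Let $a(n)$ and $b(n)$ be such that

$$G^n(a(n) + b(n)x) \to \Psi(x),$$

as $n \to \infty$ for all real $x$, where $\Psi$ is one of the three extreme value cdfs (see Resnick, 1987,
p. 38). Then, for the family (9) with $\sum_{i=1}^{\infty} \theta_i^{-1} = \infty$, and letting

$$U_n^* = \frac{U_n - a\left(\sum_{i=1}^{T_n} \theta_i^{-1}\right)}{b\left(\sum_{i=1}^{T_n} \theta_i^{-1}\right)}
 \quad \text{and} \quad
 U_{n-1}^* = \frac{U_{n-1} - a\left(\sum_{i=1}^{T_n} \theta_i^{-1}\right)}{b\left(\sum_{i=1}^{T_n} \theta_i^{-1}\right)},$$

we have

$$f_{U_n^*, U_{n-1}^*}(y, z) \to \Psi(z)\, \frac{\psi(y)}{\Psi(y)}\, \frac{\psi(z)}{\Psi(z)}, \quad y > z,$$

as $n \to \infty$, where $\psi$ is the corresponding pdf of $\Psi$.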

Proof. Let $S(i) = \sum_{j=1}^{i} \theta_j^{-1}$ and $S^{(2)}(i) = \sum_{j=1}^{i} \theta_j^{-2}$, and let $X_{i:k}$ be the $i$th order statistic of
$X_1, \ldots, X_k$. Using the independence of $(X_{i-1:i}, X_{i:i})$ and $T_n$ under the $F^{\alpha}$ model (Ballerini
and Resnick, 1987), we have

$$f_{U_n, U_{n-1}}(y, z) = \sum_{i=n}^{\infty} f_{X_{i:i}, X_{i-1:i}}(y, z \mid T_n = i)\, P(T_n = i)$$

$$= \sum_{i=n}^{\infty} f_{X_{i:i}, X_{i-1:i}}(y, z)\, P(T_n = i)$$

$$= \sum_{i=n}^{\infty} P(T_n = i) \sum_{j \ne k} [G(z)]^{S(i) - 2}\, \theta_j^{-1} \theta_k^{-1}\, g(y)\, g(z)$$

$$= \sum_{i=n}^{\infty} P(T_n = i)\, [G(z)]^{S(i) - 2}\, (S(i))^2\, g(y)\, g(z) \left[ 1 - \frac{S^{(2)}(i)}{(S(i))^2} \right]$$

$$= E\left( [G(z)]^{S(T_n) - 2}\, (S(T_n))^2\, g(y)\, g(z) \left[ 1 - \frac{S^{(2)}(T_n)}{(S(T_n))^2} \right] \right).$$

Consequently, since $g$ satisfies the Von-Mises conditions (see Resnick, 1987) and
$1 - S^{(2)}(T_n)/(S(T_n))^2 \to 1$, as $n \to \infty$, we have

$$f_{U_n^*, U_{n-1}^*}(y, z) = E\Big[ [G(b(S(T_n))z + a(S(T_n)))]^{S(T_n) - 2}\, \left(S(T_n)\, b(S(T_n))\right)^2$$
$$\times\ g(b(S(T_n))y + a(S(T_n)))\ g(b(S(T_n))z + a(S(T_n))) \left[ 1 - \frac{S^{(2)}(T_n)}{(S(T_n))^2} \right] \Big]$$

$$\to \Psi(z)\, \frac{\psi(y)}{\Psi(y)}\, \frac{\psi(z)}{\Psi(z)}.$$

Thus, the proof is complete. □

When $G$ is the standard exponential distribution, we have $a(n) = \log n$, $b(n) = 1$ and
$\Psi(x) = \exp\{-\exp(-x)\}$. Therefore, letting $U_n^* = U_n - \log\left(\sum_{i=1}^{T_n} \theta_i^{-1}\right)$ and $U_{n-1}^* = U_{n-1} -
\log\left(\sum_{i=1}^{T_n} \theta_i^{-1}\right)$, as $n \to \infty$, we have

$$f_{U_n^*, U_{n-1}^*}(y, z) \to \exp(-(z + y)) \exp\{-\exp(-z)\}, \quad y > z,$$

and consequently for each $y$ and $z$, as $n \to \infty$, we have

$$F_{U_n^*, U_{n-1}^*}(y, z) \to \exp\{-e^{-\min(y, z)}\}\left[1 + I(y > z)(e^{-z} - e^{-y})\right].$$

However, $U_n^*$ and $U_{n-1}^*$ depend on the unknown $\boldsymbol\theta$. The following result solves this
problem using the fact that under the $F^{\alpha}$ model, $n^{-1/2}(\log(S(T_n)) - n)$ converges in law
to the standard normal distribution (see Nevzorov, 1995).
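As a worked check of the exponential case, read the limit in Theorem 5 as $\Psi(z)\,\frac{\psi(y)}{\Psi(y)}\,\frac{\psi(z)}{\Psi(z)}$ for $y > z$ (the form that appears at the end of its proof) and take the Gumbel cdf $\Psi(x) = \exp\{-e^{-x}\}$. The limiting density and cdf for $y > z$ then follow by direct integration, using the substitution $s = e^{-w}$ in the last step:

```latex
\begin{align*}
\Psi(x) &= \exp\{-e^{-x}\}, \qquad \psi(x) = e^{-x}\exp\{-e^{-x}\}, \qquad
\frac{\psi(x)}{\Psi(x)} = e^{-x}, \\
\Psi(z)\,\frac{\psi(y)}{\Psi(y)}\,\frac{\psi(z)}{\Psi(z)}
 &= e^{-(y+z)} \exp\{-e^{-z}\}, \qquad y > z, \\
\int_{-\infty}^{z}\!\int_{w}^{y} e^{-(v+w)} \exp\{-e^{-w}\}\, dv\, dw
 &= \int_{-\infty}^{z} \left(e^{-w} - e^{-y}\right) e^{-w} \exp\{-e^{-w}\}\, dw \\
 &= \exp\{-e^{-z}\}\left[1 + e^{-z} - e^{-y}\right], \qquad y > z.
\end{align*}
```

For $y \le z$ the constraint $U_{n-1}^* < U_n^*$ makes the second coordinate redundant, which gives $\exp\{-e^{-y}\}$ and hence the $\min(y, z)$ in the stated limit.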

Theorem 6 Under the family (9) with $G(x) = 1 - \exp(-x)$, $x > 0$, with the assumptions
of Theorem 5, and letting $T_n^* = n^{-1/2}(\log(S(T_n)) - n)$, as $n \to \infty$, for fixed $y$, $z$ and $t$, we
have

$$F_{U_n^*, U_{n-1}^*, T_n^*}(y, z, t) \to \Phi(t) \exp\{-e^{-\min(y, z)}\}\left[1 + I(y > z)(e^{-z} - e^{-y})\right],$$

where $\Phi$ is the cdf of the standard normal distribution.

Proof As in the proof of Theorem 5, we have

$$F_{U_n^*, U_{n-1}^*, T_n^*}(y, z, t) = \sum_{i=n}^{\infty} F_{X_{i:i} - \log(S(i)),\, X_{i-1:i} - \log(S(i))}(y, z)
 \times I\left(n^{-1/2}(\log(S(i)) - n) \le t\right) P(T_n = i)$$

$$= \sum_{i=n}^{\infty} \exp\{-e^{-\min(y, z)} + O(1/S(i))\} \times \left[1 + I(y > z)(e^{-z} - e^{-y})\right]
 \times I\left(n^{-1/2}(\log(S(i)) - n) \le t\right) P(T_n = i)$$

$$\to \exp\{-e^{-\min(y, z)}\}\left[1 + I(y > z)(e^{-z} - e^{-y})\right] \Phi(t),$$

as $n \to \infty$, which is the required result. □


By Theorem 6, we have

$$P\left( \frac{U_n - n}{\sqrt n} \le x,\ \frac{U_{n-1} - n}{\sqrt n} \le y \right)$$

$$= P\left( \frac{U_n - \log S(T_n) + \log S(T_n) - n}{\sqrt n} \le x,\ \frac{U_{n-1} - \log S(T_n) + \log S(T_n) - n}{\sqrt n} \le y \right)$$

$$\to \Phi(\min\{x, y\}) = \min\{\Phi(x), \Phi(y)\}, \tag{12}$$

as $n \to \infty$, which is the upper Fréchet-Hoeffding bound; see, e.g., Fréchet (1951) or Nelsen
(1999, p. 9). The following corollary presents an approximate formula for the risk of the
UMVUE of $\theta_{[n]}$, for the family (9).

Corollary 4 For the family (9), under the assumptions of Theorem 6, we have

$$R\left(H(U_n) - H(U_{n-1}),\ \theta_{[n]}\right) = o(n), \quad \text{as } n \to \infty.$$

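Proof From (12) and by Hoeffding's theorem,

$$\lim_{n \to \infty} \mathrm{Cor}\left( \frac{U_{n-1}^{H} - n}{\sqrt n},\ \frac{U_n^{H} - n}{\sqrt n} \right)
 = \lim_{n \to \infty} \mathrm{Cov}\left( \frac{U_{n-1}^{H} - n}{\sqrt n},\ \frac{U_n^{H} - n}{\sqrt n} \right)$$

$$= \int\!\!\int \left\{ \min(\Phi(x), \Phi(y)) - \Phi(x)\Phi(y) \right\} dx\, dy.$$

The above double integral can be simplified by algebraic manipulations, involving the pdf $\phi$
of the standard normal distribution, and equals 1. Thus, we have

$$\frac{1}{n} R\left(U_n^{H} - U_{n-1}^{H},\ \theta_{[n]}\right)
 = \frac{1}{2} E\left( \left( \frac{U_n^{H} - n}{\sqrt n} - \frac{U_{n-1}^{H} - n}{\sqrt n} \right)^2 \right) \to 0,$$

as $n \to \infty$.

□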


6 Rainfall data: an illustrative example 

In this section, we use the data set of annual (January 1-December 31) rainfall amounts,
in inches, recorded at Los Angeles Civic Center (LACC) during the 100-year period
from 1890 until 1989, presented by Arnold et al. (1998, p. 180).

A member of the $F^{\alpha}$ model (Model 2) with survival function as in (9), that is the
Rayleigh distribution with cdf

$$F(x) = 1 - \exp\left\{ -\frac{(x - 4)^{1.9}}{113.23} \right\}, \quad x > 4, \tag{13}$$

fits the data well. The p-value of the two-sample Kolmogorov-Smirnov test is 0.3333.
Figure 1 shows the empirical distribution function of the rainfall data and the cdf in (13).
Thus, we take

$$H(x) = (x - 4)^{1.9}$$

to be the known cumulative hazard rate function of the base distribution $G(x) = 1 -
\exp\{-(x - 4)^{1.9}\}$, $x > 4$.
Suppose that the only observations are the sequence of upper record values as follows:

12.69 12.84 18.72 21.96
23.92 27.16 31.28 34.04.

Figure 1: Empirical cdf of the rainfall data.

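We consider two hypotheses:

$H_0$: (Stationary model) $X_1, X_2, \ldots \sim F_{\theta}(x) = 1 - \exp\left\{ -\frac{(x - 4)^{1.9}}{\theta} \right\}$, $x > 4$;

$H_1$: (Non-stationary model) $X_i \sim F_{\theta_i}(x) = 1 - \exp\left\{ -\frac{(x - 4)^{1.9}}{\theta_i} \right\}$, $x > 4$, $i = 1, 2, \ldots$ and
the $X_i$'s are independent.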

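Under $H_0$, $\theta_{[n]} = \theta$, $n = 1, 2, \ldots$, with probability 1. Hence, $\hat\theta_{[n]} = \frac{H(U_n)}{n} = \frac{(U_n - 4)^{1.9}}{n}$ is
the UMVUE of $\theta_{[n]} = \theta$. Also, $R(\hat\theta_{[n]}, \theta_{[n]}) = \mathrm{Var}\left(\frac{H(U_n)}{n}\right) = \frac{\theta^2}{n}$, with unbiased estimator

$$\hat R(\hat\theta_{[n]}, \theta_{[n]}) = \frac{[H(U_n)]^2}{n^2(n+1)} = \frac{(U_n - 4)^{3.8}}{n^2(n+1)}.$$

Under $H_1$, $\hat\theta_{[n]} = H(U_n) - H(U_{n-1}) = (U_n - 4)^{1.9} - (U_{n-1} - 4)^{1.9}$ and the unbiased
estimator of its risk is

$$\hat R(\hat\theta_{[n]}, \theta_{[n]}) = \frac{\left(H(U_n) - H(U_{n-1})\right)^2}{2} = \frac{\left[(U_n - 4)^{1.9} - (U_{n-1} - 4)^{1.9}\right]^2}{2}.$$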
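The two sets of estimates can be computed from the listed record values with a short sketch. It assumes $H(x) = (x - 4)^{1.9}$, the stationary estimate $H(U_n)/n$ with risk estimate $[H(U_n)]^2/(n^2(n+1))$, and the non-stationary estimate $H(U_n) - H(U_{n-1})$ with risk estimate $(H(U_n) - H(U_{n-1}))^2/2$. Function names are hypothetical, and the non-stationary case is only evaluated for $n \ge 2$ because the text does not specify $U_0$.

```python
import math
from typing import List, Tuple

# Upper record values of the LACC annual rainfall series, as listed in the text.
RECORDS: List[float] = [12.69, 12.84, 18.72, 21.96, 23.92, 27.16, 31.28, 34.04]


def cum_hazard(x: float) -> float:
    """Cumulative hazard H(x) = (x - 4) ** 1.9 of the base distribution, for x > 4."""
    return (x - 4.0) ** 1.9


def stationary_estimate(records: List[float], n: int) -> Tuple[float, float]:
    """Return (estimate, estimated risk) under H0 from the n-th record (n is 1-based)."""
    h = cum_hazard(records[n - 1])
    return h / n, h * h / (n * n * (n + 1))


def nonstationary_estimate(records: List[float], n: int) -> Tuple[float, float]:
    """Return (estimate, estimated risk) under H1 from records n-1 and n (n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    d = cum_hazard(records[n - 1]) - cum_hazard(records[n - 2])
    return d, 0.5 * d * d


def band(estimate: float, risk: float) -> Tuple[float, float]:
    """Return the region (max(0, est - 1.5 * sqrt(risk)), est + 1.5 * sqrt(risk))."""
    half = 1.5 * math.sqrt(risk)
    return max(0.0, estimate - half), estimate + half


if __name__ == "__main__":
    for n in range(2, len(RECORDS) + 1):
        e0, r0 = stationary_estimate(RECORDS, n)
        e1, r1 = nonstationary_estimate(RECORDS, n)
        print(n, round(e0, 2), band(e0, r0), round(e1, 2), band(e1, r1))
```

The second record exceeds the first by only 0.15 inches, so the non-stationary estimate at $n = 2$ is very small compared with its neighbours.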

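Figure 2 shows the values of $\hat\theta_{[n]}$ and their corresponding 3-$\sigma$ region

$$\left( \max\left\{0,\ \hat\theta_{[n]} - 1.5\sqrt{\hat R(\hat\theta_{[n]}, \theta_{[n]})}\right\},\ \ \hat\theta_{[n]} + 1.5\sqrt{\hat R(\hat\theta_{[n]}, \theta_{[n]})} \right),$$

under $H_0$ and $H_1$.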
Figure 2: Estimates path (solid line) and 3-$\sigma$ regions (upper and lower dashed lines) of $\theta_{[n]}$, under the stationary
(straight lines) and non-stationary (zigzag lines) assumptions, for the rainfall data. The horizontal axis is $n = 1, \ldots, 8$.


To test $H_0$ against $H_1$ using the record sequence, we propose the scale invariant test
statistic

$$T = \frac{1}{n-1} \sum_{i=2}^{n} \left( \frac{\hat\theta_{[i]}}{\hat\theta_{[i-1]}} - 1 \right)^2, \tag{14}$$

where $\hat\theta_{[i]} = H(U_i) - H(U_{i-1})$ is the estimate of $\theta_{[i]}$ under $H_1$.
Since, under $H_0$, all $\theta_{[i]}$'s are equal, the null hypothesis is rejected for large values of $T$.

We use the fact that under $H_0$, the random variables $H(U_n) - H(U_{n-1})$, $n \ge 2$, are iid
exponential, to deduce that under $H_0$,

$$T \overset{d}{=} \frac{1}{n-1} \sum_{i=2}^{n} \left( \frac{Z_i}{Z_{i-1}} - 1 \right)^2,$$

where $\overset{d}{=}$ stands for identically distributed and $Z_1, \ldots, Z_n \sim \mathrm{Exp}(1)$.

Deriving the exact distribution of $T$ is out of reach. However, one can estimate the
quantiles of the distribution of $T$ using a Monte Carlo simulation study.

To generate random variables identically distributed as $T$, one may generate an iid
sample from the standard exponential distribution, namely $Z_1, \ldots, Z_n$, and return
$T = \frac{1}{n-1} \sum_{i=2}^{n} \left( \frac{Z_i}{Z_{i-1}} - 1 \right)^2$.

Table 2 presents the simulated $\alpha$-critical values of $T$, $t_n(\alpha)$, for $n = 2, \ldots, 10$,
and $\alpha = 0.01, 0.025, 0.05, 0.1$, which are generated using R.14.1 package with $10^5$ iterations.
Table 2: The critical values $t_n(\alpha)$ of the test statistic $T$.

   n    alpha = 0.01   alpha = 0.025   alpha = 0.05   alpha = 0.1
   2    8645.63        1368.24         326.02         64.61
   3    19003.73       3113.96         723.25         164.76
   4    27929.12       4681.26         1093.01        264.36
   5    37018.72       6343.56         1529.97        355.73
   6    49769.98       7707.69         2007.57        456.78
   7    64315.21       9211.87         2388.29        563.19
   8    70630.56       10801.06        2698.59        655.51
   9    73372.31       11655.77        3131.44        747.15
  10    92847.93       13727.53        3500.69        883.22

The hypothesis $H_0$ is rejected at level $\alpha$ if

$$T > t_n(\alpha).$$

For the rainfall data we obtain $T = 279.14$, which is less than $t_8(0.05) = 2698.59$.
Therefore, $H_0$ is not rejected in favor of $H_1$ at level $\alpha = 0.05$.
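The Monte Carlo step used for critical values can be sketched as follows. The sketch assumes the null representation $T \overset{d}{=} \frac{1}{n-1}\sum_{i=2}^{n}(Z_i/Z_{i-1} - 1)^2$ with iid standard exponential $Z_i$; this form is an assumption made here, since the displayed statistic is only partly legible in the available text. The function names, the seed and the simple order-statistic quantile rule are also illustrative choices.

```python
import random
from typing import List


def positive_exp(rng: random.Random) -> float:
    """Draw a standard exponential variate, redrawing in the rare event of exactly 0.0."""
    z = 0.0
    while z <= 0.0:
        z = rng.expovariate(1.0)
    return z


def simulate_t(n: int, rng: random.Random) -> float:
    """One draw of the assumed null form of T for n records (n >= 2)."""
    z: List[float] = [positive_exp(rng) for _ in range(n)]
    total = sum((z[i] / z[i - 1] - 1.0) ** 2 for i in range(1, n))
    return total / (n - 1)


def critical_value(n: int, alpha: float, reps: int = 100_000, seed: int = 0) -> float:
    """Estimate t_n(alpha) as the empirical (1 - alpha) quantile of simulated T values."""
    rng = random.Random(seed)
    draws = sorted(simulate_t(n, rng) for _ in range(reps))
    index = min(reps - 1, int((1.0 - alpha) * reps))
    return draws[index]


if __name__ == "__main__":
    for n in (2, 3, 8):
        print(n, critical_value(n, alpha=0.1))
```

Under this assumed form and $n = 2$, the ratio $Z_2/Z_1$ has cdf $r/(1+r)$, so $P(T > c) = 1/(2 + \sqrt c)$ for $c \ge 1$, which gives $c = 64$ at $\alpha = 0.1$. The upper tail is very heavy, so estimates at small $\alpha$ vary considerably from run to run.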

7 Concluding remarks 

The problem of estimating parameters of the dynamically selected populations can be
extended to the Bayesian context. Moreover, the problem of unbiased estimation of the
selected parameters under other loss functions is of interest. Distributional models
which are not members of the families studied here can be treated separately, especially
discrete distributions. Another problem is to find the two-stage (conditionally) unbiased
estimators of the parameters of the dynamically selected populations. These problems are
treated in upcoming work, to appear in subsequent papers.